ABSTRACT

In this paper, the detection and estimation of change points of location parameters are
studied by means of localization procedures and rank statistics. These techniques are also
applied to the detection and estimation of change points of scale parameters and of
location parameters of directional data.


1. INTRODUCTION


The change point problem arises in many fields and has attracted the attention of many
authors. The techniques employed to detect and estimate change points can generally be
classified into two categories: parametric (Krishnaiah and Miao, 1988) and nonparametric
(Csörgő and Horváth, 1988). Bayesian methods also play a major role (Zacks, 1983).

In this paper, we concentrate our attention on the problem of detecting change
points of the location parameter by localization and rank statistics when the sample is
large. Our method differs from, and has some advantages over, existing methods such as
CUSUM (cumulative sum) and Csörgő and Horváth's non-sequential nonparametric
AMOC (at most one change) procedures. First, localized procedures reduce computation.
Second, the detecting and estimating procedures require no moment conditions; instead,
we only assume that the observed data come from a continuous distribution with a unique
median.

2.  MODEL  AND  DETECTING  AND  ESTIMATING  PROCEDURE 

Let $x(t)$ be an independent process on the interval $(0,1]$ whose marginal
distributions differ only by location parameters. Specifically, there exist
$t_0, t_1, \ldots, t_{q+1}$ and $a_1, \ldots, a_{q+1}$ such that
$0 = t_0 < t_1 < \cdots < t_q < t_{q+1} = 1$ and

$$x(t) \sim F(x - a_j), \quad \text{if } t_{j-1} < t \le t_j, \quad j = 1, 2, \ldots, q+1, \tag{2.1}$$

with

$$a_j \ne a_{j+1} \quad \text{for } j = 1, \ldots, q. \tag{2.2}$$

As usual, $t_1, \ldots, t_q$ are called change points. In practical applications, the number $q$
and the locations of the change points are unknown. To estimate $q$ and $t_1, \ldots, t_q$, we sample
this process sequentially at equal distances, and get $x(1/n), x(2/n), \ldots, x(n/n)$. By
assumption, $x(1/n), \ldots, x(n/n)$ are independent. Define $k_0^0 = 0$, $k_{q+1}^0 = n$, and
$k_1^0, \ldots, k_q^0$ as follows:

$$|k_j^0(n)/n - t_j| < 1/n, \quad j = 1, \ldots, q. \tag{2.3}$$


Then we have

$$x(i/n) \sim F(x - a_j), \quad \text{if } k_{j-1}^0(n) < i \le k_j^0(n), \quad j = 1, 2, \ldots, q+1. \tag{2.4}$$

Hence estimating $(t_1, \ldots, t_q)$ is equivalent to estimating
$(k_1^0(n)/n, \ldots, k_q^0(n)/n)$. For simplicity, set $x(i/n) = x_i$ and $k_j^0(n) = k_j^0$.
Hereafter, $\alpha_n \gg \beta_n$ means $\alpha_n/\beta_n \to \infty$. Let $m = m_n$ and $C_n$ be
positive integers such that $n \gg m_n \gg C_n \gg \log n$.
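Set $A_{k,m} = \{x_{k-m+1}, \ldots, x_k, x_{k+1}, \ldots, x_{k+m}\}$, $k = m, \ldots, n-m$, and let
$R_{k,j}$ be the rank of $x_j$ in $A_{k,m}$. Define

$$y_{k,j} = \begin{cases} 1 & \text{if } R_{k,k-m+j} > m, \\ -1 & \text{otherwise,} \end{cases} \quad j = 1, \ldots, 2m, \tag{2.5}$$

$$S_{k,m} = \sum_{j=1}^{m} y_{k,j}, \tag{2.6}$$

$$D_n = \{k : m \le k \le n-m,\ S_{k,m}^2/m > C_n\}, \quad k_{1,n} = \min\{k : k \in D_n\}, \quad D_{1,n} = \{k : k \in D_n,\ k - k_{1,n} < 3m\}. \tag{2.7}$$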

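Next, put

$$k_{2,n} = \min\{k : k \in D_n - D_{1,n}\}, \quad D_{2,n} = \{k : k \in D_n - D_{1,n},\ k - k_{2,n} < 3m\}.$$

Continuing this process, we can define $D_{1,n}, D_{2,n}, \ldots, D_{\hat q,n}$, which are easily seen
to be nonempty, where $\hat q$ is the number of sets so obtained. We have

$$D_n = D_{1,n} + D_{2,n} + \cdots + D_{\hat q,n}.$$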


Define 

$$\hat t_j = (2n)^{-1}\left[k_{j,n} + \max\{k : k \in D_{j,n}\}\right], \quad j = 1, 2, \ldots, \hat q.$$

We  have  the  following  theorem. 

THEOREM. If the distribution $F$ is continuous and has a unique median, then
$(\hat q, \hat t_1, \ldots, \hat t_{\hat q})$ is a strongly consistent estimate of $(q, t_1, \ldots, t_q)$.
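
The procedure of this section can be sketched in code. The following Python is a minimal illustration under stated assumptions, not the authors' implementation. The function names `window_statistic` and `detect_change_points` are hypothetical; the convention that a window point scores $+1$ when its rank exceeds $m$ is an assumption (the opposite sign convention yields the same $S_{k,m}^2$); and the midpoint of each cluster is divided by $n$ so that the estimate lies on the time scale of (2.3).

```python
from typing import List, Sequence, Tuple


def window_statistic(x: Sequence[float], k: int, m: int) -> int:
    """Rank statistic S_{k,m} for the window x_{k-m+1}, ..., x_{k+m} (1-based).

    Assumed convention: a point in the first half of the window scores +1
    if its rank within the whole window exceeds m, and -1 otherwise.
    """
    window = x[k - m:k + m]             # 2m observations centred at k
    cutoff = sorted(window)[m - 1]      # m-th smallest value in the window
    above = sum(1 for v in window[:m] if v > cutoff)
    return 2 * above - m                # (+1 count) - (-1 count)


def detect_change_points(x: Sequence[float], m: int, c_n: float) -> Tuple[int, List[float]]:
    """Return (estimated number of change points, estimated locations in (0, 1])."""
    n = len(x)
    # D_n: positions whose normalised squared statistic exceeds the threshold.
    flagged = [k for k in range(m, n - m + 1)
               if window_statistic(x, k, m) ** 2 / m > c_n]
    estimates: List[float] = []
    while flagged:
        k_first = flagged[0]                                   # k_{j,n}
        cluster = [k for k in flagged if k - k_first < 3 * m]  # D_{j,n}
        # Midpoint of the cluster, rescaled by n (assumption, see lead-in).
        estimates.append((k_first + cluster[-1]) / (2 * n))
        flagged = flagged[len(cluster):]                       # D_n minus earlier clusters
    return len(estimates), estimates
```

For a list `x` of $n$ observations, one might call `detect_change_points(x, m=20, c_n=12.0)`. These values of `m` and `c_n` are illustrative only: the theory constrains only their growth rates, $n \gg m_n \gg C_n \gg \log n$, and says nothing about constants for a finite sample.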
3. LEMMAS

To prove the above theorem, we need some lemmas.

LEMMA 1. (Hoeffding, 1963) Let the population $C$ consist of $N$ values
$c_1, \ldots, c_N$. Let $x_1, \ldots, x_n$ denote a random sample without replacement from $C$ and let
$y_1, \ldots, y_n$ denote a random sample with replacement from $C$. If the function $f(x)$ is
continuous and convex, then

$$E f\Big(\sum_{i=1}^{n} x_i\Big) \le E f\Big(\sum_{i=1}^{n} y_i\Big).$$


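LEMMA 2. Let the notation be as defined in Section 2 and define

$$B_{k,m} = \{x_{k-m+1}, \ldots, x_k\}, \quad k = m, m+1, \ldots, n, \tag{3.1}$$

$$S_{k,m} = \sum_{j=1}^{m} y_{k,j}, \tag{3.2}$$

$$B_n = \{k : \exists\, j,\ 1 \le j \le q+1, \text{ such that } k_{j-1}^0 + 1 \le k - m + 1 < k \le k_j^0\}. \tag{3.3}$$

Then, we have

$$S_{k,m}^2/m = O(\log n) \quad \text{a.s.}$$

uniformly for all $k \in B_n$.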

Proof. Let $z_{k,1}, \ldots, z_{k,m}$ be a random sample with replacement from the population
$\{1, \ldots, 1, -1, \ldots, -1\}$, where the numbers of $1$'s and $-1$'s are both $m$. By Lemma 1, for any
$t \in (0, 1/4)$ and $A > 0$, we have

$$\begin{aligned}
P\big(S_{k,m} \ge A\sqrt{m \log n}\big) &\le \exp\{-tA\sqrt{m \log n}\}\, E e^{t S_{k,m}} \\
&\le \exp\{-tA\sqrt{m \log n}\}\, E \exp\Big\{t \sum_{j=1}^{m} z_{k,j}\Big\} \\
&= \exp\{-tA\sqrt{m \log n}\}\, \big(E \exp\{t z_{k,1}\}\big)^m \\
&= \exp\{-tA\sqrt{m \log n}\}\, \big((e^{t} + e^{-t})/2\big)^m \\
&\le \exp\{-tA\sqrt{m \log n} + m t^2 e^{t}/2\}.
\end{aligned}$$

Since $m \gg \log n$, it is possible for $n$ large to take $t = A\sqrt{(\log n)/m} < 1/4$. It follows that

$$P\big(S_{k,m} \ge A\sqrt{m \log n}\big) < \exp\{-0.3 A^2 \log n\}. \tag{3.4}$$

A similar argument gives

$$P\big(S_{k,m} \le -A\sqrt{m \log n}\big) < \exp\{-0.3 A^2 \log n\}. \tag{3.5}$$

The inequalities (3.4) and (3.5) imply

$$\sum_{n=1}^{\infty} P\Big(\sup_{k \in B_n} |S_{k,m}| \ge A\sqrt{m \log n}\Big) \le \sum_{n=1}^{\infty} 2n \exp\{-0.3 A^2 \log n\}.$$

This is a convergent series if $0.3A^2 > 2$. Take $A = 3$; by the Borel-Cantelli lemma, with
probability one for large $n$, we have uniformly for all $k \in B_n$,

$$|S_{k,m}| \le 3\sqrt{m \log n},$$

that is,

$$S_{k,m}^2/m \le 9 \log n \quad \text{a.s.}$$

uniformly for all $k \in B_n$.
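
The chain of exponential bounds in this proof can be checked numerically. The sketch below is only an illustration: the function name `lemma2_log_bounds` is hypothetical, and it works with logarithms of the bounds to avoid floating-point underflow. It evaluates, for given $m$, $n$ and $A$, the moment generating function bound, its relaxation $\exp\{-tA\sqrt{m\log n} + mt^2e^t/2\}$, and the right-hand side of (3.4), with $t = A\sqrt{(\log n)/m}$.

```python
import math
from typing import Tuple


def lemma2_log_bounds(m: int, n: int, A: float = 3.0) -> Tuple[float, float, float]:
    """Logs of three successive upper bounds on P(S_{k,m} >= A*sqrt(m log n))."""
    t = A * math.sqrt(math.log(n) / m)
    if not 0.0 < t < 0.25:
        # The proof needs t < 1/4, which holds once m is large relative to log n.
        raise ValueError("m is too small relative to log n: need t < 1/4")
    threshold = A * math.sqrt(m * math.log(n))
    # log of exp(-t*threshold) * ((e^t + e^-t)/2)^m
    log_mgf = -t * threshold + m * math.log(math.cosh(t))
    # log of exp(-t*threshold + m t^2 e^t / 2)
    log_relaxed = -t * threshold + m * t * t * math.exp(t) / 2.0
    # log of exp(-0.3 A^2 log n), the right-hand side of (3.4)
    log_target = -0.3 * A * A * math.log(n)
    return log_mgf, log_relaxed, log_target
```

With, say, `lemma2_log_bounds(5000, 10**4)`, the three returned values should be non-decreasing, mirroring the order of the inequalities in the proof.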

LEMMA 3. Let $x^*_{i,k}$ denote the $i$-th sample order statistic in $B_{k,m}$. Let $[\alpha m]$ denote
the integer part of $\alpha m$. Assume that the $\alpha$-quantile of the continuous distribution $F$ is
unique, and is denoted by $\mu_\alpha$. Then

$$x^*_{[\alpha m],k} \to \mu_\alpha + a_j \quad \text{a.s.}$$

uniformly for all $k \in B_n$, where $k_{j-1}^0 + 1 \le k - m + 1 < k \le k_j^0$.
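Proof. Without loss of generality, we assume that $0 < \alpha < 1$,
$k_{j-1}^0 + 1 \le k - m + 1 < k \le k_j^0$ for some fixed $j$, and $a_j = 0$. Write $r = [\alpha m]$; then for
any $\varepsilon > 0$,

$$P\big(|x^*_{r,k} - \mu_\alpha| \ge \varepsilon\big) = P\big(x^*_{r,k} \le \mu_\alpha - \varepsilon\big) + P\big(x^*_{r,k} \ge \mu_\alpha + \varepsilon\big) = I_1 + I_2.$$

Set $F(\mu_\alpha - \varepsilon) = \alpha - \delta$. By the uniqueness of the $\alpha$-quantile of $F$, $\delta > 0$. Without loss of
generality, we can assume that $\alpha - \delta > 0$. Since $F(x)$ is continuous, one gets

$$I_1 = P\big(x^*_{r,k} \le \mu_\alpha - \varepsilon\big) = \frac{m!}{(r-1)!\,(m-r)!} \int_0^{\alpha-\delta} t^{r-1}(1-t)^{m-r}\,dt.$$

For a fixed $\theta \in (0,1)$, put

$$g_\theta(t) = t^{\theta}(1-t)^{1-\theta}.$$

It is easy to see that $g_\theta(t)$ is increasing in $t$ on the interval $[0, \theta]$, which implies that
$t^{r-1}(1-t)^{m-r}$ is increasing for $t \in [0, (r-1)/(m-1)]$. Since $r = [\alpha m]$, we get
$\alpha - \delta < (r-1)/(m-1)$ for large $m$. Using Stirling's formula, we obtain

$$\begin{aligned}
I_1 &\le \frac{m!}{(r-1)!\,(m-r)!}\,(\alpha-\delta)^{r-1}(1-\alpha+\delta)^{m-r} \\
&\le \frac{C_1\, m \sqrt{2\pi m}}{\sqrt{2\pi \cdot 2\pi \cdot r(m-r)}} \cdot \frac{(\alpha-\delta)^{r-1}(1-\alpha+\delta)^{m-r}}{(r/m)^{r}\,((m-r)/m)^{m-r}} \\
&\le C\sqrt{m}\, q_1^{m},
\end{aligned} \tag{3.6}$$

where $C_1$ and $C$ are constants depending on $\alpha$, $\delta$, and

$$q_1 = \{(\alpha-\delta)/(\alpha-\delta/2)\}^{\alpha}\,\{(1-\alpha+\delta)/(1-\alpha+\delta/2)\}^{1-\alpha} = g_\alpha(\alpha-\delta)/g_\alpha(\alpha-\delta/2) < 1.$$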

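Similarly, one can show that

$$P\big(x^*_{r,k} \ge \mu_\alpha + \varepsilon\big) \le C\sqrt{m}\, q_2^{m}, \tag{3.7}$$

where $0 < q_2 < 1$. Take $q = \max(q_1, q_2)$; then (3.6), (3.7) and $m_n \gg \log n$ together imply

$$\sum_{n=1}^{\infty} P\Big(\sup_{k \in B_n} |x^*_{r,k} - \mu_\alpha| \ge \varepsilon\Big) \le \sum_{n=1}^{\infty} 2Cn\sqrt{m}\, q^{m} < \infty.$$

By the Borel-Cantelli lemma, it follows that

$$x^*_{[\alpha m],k} \to \mu_\alpha + a_j \quad \text{a.s.}$$

uniformly for all $k \in B_n$, where $k_{j-1}^0 + 1 \le k - m + 1 < k \le k_j^0$.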
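
The geometric rate $q_1$ appearing in (3.6) is easy to evaluate for concrete values. The sketch below is illustrative only; the function names `g` and `q1` are hypothetical labels for $g_\theta(t)$ and $q_1$ as written in the proof, and the sample values of $\alpha$ and $\delta$ are arbitrary.

```python
def g(theta: float, t: float) -> float:
    """g_theta(t) = t^theta * (1 - t)^(1 - theta), increasing in t on [0, theta]."""
    return t ** theta * (1.0 - t) ** (1.0 - theta)


def q1(alpha: float, delta: float) -> float:
    """Rate q_1 = g_alpha(alpha - delta) / g_alpha(alpha - delta/2)."""
    if not 0.0 < delta < alpha < 1.0:
        raise ValueError("need 0 < delta < alpha < 1")
    return g(alpha, alpha - delta) / g(alpha, alpha - delta / 2.0)
```

For instance, `q1(0.5, 0.1)` lies strictly between 0 and 1, as the monotonicity of $g_\alpha$ on $[0, \alpha]$ requires; the closer it is to 1, the larger $m$ must be before the factor $q_1^m$ dominates $\sqrt{m}$.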

LEMMA 4. Let $x_i \in A_{k_j^0,m}$, $j = 1, 2, \ldots, q$, and let $S_{k_j^0,m}$ be as defined by (2.6). Then
there exists a positive $\lambda$ such that, with probability one for $n$ large,

$$S^2_{k_j^0,m}/m \ge m\lambda^2, \quad j = 1, \ldots, q.$$

Proof. Let $x_i \in A_{k_j^0,m}$ for some fixed $j$. Without loss of generality, we assume
$a_{j+1} > a_j$. Take $\lambda \in (0, 1/2)$ satisfying the following conditions:

1° the $(1/2 - \lambda)$- and $(1/2 + \lambda)$-quantiles of the continuous distribution $F(x)$ are
unique.
(3.8)


(3.9) 


Proof  of  Theorem. 
Let $k_j^0$ be defined by (2.3). For fixed $j$, by Lemma 4, we have, with probability one
for $n$ large,

$$S^2_{k_j^0,m}/m \ge m\lambda^2.$$

Since $m \gg C_n$, we have $m\lambda^2 > C_n$ for large $n$, so by the definition of $D_n$,
$k_j^0 \in D_n$, $j = 1, \ldots, q$. Further, by the definition of the $D_{j,n}$'s
and $n \gg m$, it follows that with probability one for large $n$, $k_1^0, \ldots, k_q^0$ belong to different
$D_{j,n}$'s, which in turn implies

$$\hat q \ge q \tag{4.1}$$

with probability one for large $n$. On the other hand, from Lemma 2 and $C_n \gg \log n$,
it follows that with probability one for $n$ large $S_{k,m}^2/m < C_n$. Further, take $0 < \varepsilon < 1/2$
and write

$$K_n = \{k : m \le k \le n-m,\ |k/n - t_j| \ge (m/n)(1+\varepsilon) \text{ for } j = 1, \ldots, q\}. \tag{4.2}$$

Then, since $n \gg m_n$ and $A_{k,m}$ consists of i.i.d. random variables for $k \in K_n$, it follows that
with probability one for $n$ large

$$D_n \cap K_n = \emptyset. \tag{4.3}$$


By the definition of $D_{j,n}$ and the fact that $2m(1+\varepsilon) < 3m$, we have with probability
one for large $n$

$$\hat q \le q. \tag{4.4}$$

Combining (4.1) to (4.4), it follows that $\hat q = q$ with probability one for $n$ large,
and

$$\lim_{n \to \infty} \hat t_j = t_j, \quad j = 1, 2, \ldots, q.$$

Thus, the theorem is established.


Our procedure can also be applied to the detection of change points of scale
parameters. All that is needed is to consider $\log |x(t)|$ instead of $x(t)$.
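
The reduction from scale to location can be illustrated with a short sketch. The function name `scale_to_location` is hypothetical, and rejecting exact zeros is an assumption made here for safety (under a continuous distribution a zero occurs with probability zero). Multiplying an observation by a scale factor $c > 0$ adds $\log c$ to $\log|x|$, so a change in scale becomes a change in location of the transformed data.

```python
import math
from typing import Iterable, List


def scale_to_location(x: Iterable[float]) -> List[float]:
    """Map observations to log|x| so that scale changes become location shifts."""
    transformed: List[float] = []
    for v in x:
        if v == 0.0:
            # log|0| is undefined; flagged explicitly rather than silently dropped.
            raise ValueError("log|x| is undefined for a zero observation")
        transformed.append(math.log(abs(v)))
    return transformed
```

The transformed sequence can then be passed to any location change point procedure in place of the original observations.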

Likewise, this method can be applied to the detection of change points of location
parameters on a circle, i.e. directional data. Lombard (1986) proposed this problem and
discussed the related testing and estimation problems when the number of change points is
known. Our technique can handle the case in which this number is unknown.

For directional data, fix an origin and an axis and let $\{x(t),\ 0 \le t \le 1\}$ be the angle
process, where $0 \le x(t) < 2\pi$. Assume that $0 = t_0 < t_1 < \cdots < t_q < t_{q+1} = 1$, where
$t_1, \ldots, t_q$ are the change points, so that $x(t) \sim F(x - a_j)$ if $t_{j-1} < t \le t_j$,
$j = 1, \ldots, q+1$, where $a_j \ne a_{j+1} \pmod{2\pi}$, $j = 1, \ldots, q$, and $a_1, \ldots, a_{q+1}$ denote the
angular location parameters. The special feature of the angle process is that the angle is
always in $[0, 2\pi)$. So the angle of $\alpha + \beta$ equals $\alpha + \beta - 2\pi$ if $0 \le \alpha, \beta < 2\pi$ and
$\alpha + \beta \ge 2\pi$. A closer look at our proof reveals that this feature has an effect only on
(3.10) and (3.11) when $\mu_{1/2+\lambda} + a_j = 0 \pmod{2\pi}$ or $\mu_{1/2-\lambda} + a_{j+1} = 0 \pmod{2\pi}$
for some $j$, $1 \le j \le q$. Therefore, if the continuous distribution $F(x)$ has a unique median
and $\mu_{1/2} + a_j \ne 0 \pmod{2\pi}$ for all $j$, $1 \le j \le q$, our procedure can be used to detect and
estimate the change points of directional data. In practical applications, the origin and
axis are chosen after gathering the data, in a way that makes the analysis more convenient.
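
Choosing the origin after the data are gathered amounts to rotating all angles by a common offset. The sketch below is an illustration under assumptions: the function name `rotate_origin` is hypothetical, and how the new origin is selected (for example, away from where the observations concentrate) is left to the analyst and is not prescribed by the text.

```python
import math
from typing import Iterable, List

TWO_PI = 2.0 * math.pi


def rotate_origin(angles: Iterable[float], new_origin: float) -> List[float]:
    """Re-express angles in [0, 2*pi) relative to a new origin direction."""
    rotated: List[float] = []
    for a in angles:
        r = (a - new_origin) % TWO_PI
        if r >= TWO_PI:      # guard against floating-point rounding up to 2*pi
            r = 0.0
        rotated.append(r)
    return rotated
```

For example, the angles 0.1 and 6.2 sit on opposite sides of the wrap-around point even though they are close on the circle; after `rotate_origin([0.1, 6.2], math.pi)` they become neighbouring values near $\pi$, so that ranks computed on the real line reflect their circular proximity.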